Interactive copying system.

A system for generating new documents (20)
from originals (4) containing text and/or images 
(22,28) employing e.g. a camera-projector sys- 
tem (6,8) focussed on a work surface (2), in 
conjunction with a copier or printer. In use, the 
camera 6 captures various manual operations 
carried out by the user, e.g. by pointing with 
fingers and tapping on the surface on the text or 
images (22,28) in an original paper document 
(4) on the surface (2) and representing manipu- 
lations of the text or images (22,26). Feedback 
to the user is provided by projection of an image 
(21,24) onto the surface or onto the original, or 
using some other visual display. 
The present invention relates to interactive image
reproduction machines, and more particularly repro- 
duction machines for performing various operations 
on text or images during the creation of a new docu- 
ment. 

It is common for office workers and others who 
work with documents on a regular basis to have effectively
two desks: the "electronic desktop" provided
by a workstation or personal computer by means of a 
graphical interface, and the physical desk on which 
paper documents are received and processed. 

The electronic desktop, which has become more 
and more like the physical one, can perform numer- 
ous useful operations on documents stored in elec- 
tronic form; but in dealing with tangible documents 
such devices have limitations. A paper document 
must either be converted into electronic form before 
the operations are performed on it in the electronic 
environment, or copying operations are carried out on 
the tangible document using an electrophotographic 
copier (such as a photocopier with an editing func- 
tion) or a combined scanning and printing device, the 
available functions of which are restricted in nature. 

US-A-5,191,440 discloses a photocopier system 
for combining plural image segments taken from a 
series of different documents and printing the series 
of image segments as a composite image on a com- 
mon copy sheet. The documents are sequentially 
scanned in, and the results of page creation and edit 
functions may be previewed on a display screen. 

It is known from EP-A-495 622 to use a camera- 
projector arrangement positioned above a desk, in or- 
der to select functions to be performed by selecting 
items located within the field of view of the camera. 
Such functions include calculating and translating op- 
erations carried out on data [e.g., in a paper docu- 
ment] located on the desk. 

The present invention seeks to reduce the above- 
mentioned limitations on the operations which may be 
performed on data in a paper document, using inter- 
active techniques in which the paper document effec- 
tively becomes part of the means for designating 
which operations (such as text/image selection, cre- 
ation and manipulation) are carried out on the infor- 
mation contained in it, in order to create new docu- 
ments using a processor-controlled copying or print- 
ing device. 

It is an object of the present invention to provide 
an interactive copying system in which the available 
functions are expanded beyond both those provided 
by the conventional electronic environment alone, 
and those provided by existing copying equipment 
alone. 

The present invention provides a copying system,
comprising: a work surface; means for displaying
images on the work surface; a camera, focussed on the
work surface, for generating video signals representing
in electronic form image information present within
the field of view of the camera; processing means
for recognising one or more manual operations relating
to the image information which are executed by a
user within the field of view of the camera, and for
performing electronic operations, corresponding to said
manual operation(s), on the electronic form to produce
a modified electronic form; the displaying
means being adapted to display, under the control of
the processing means, simultaneously with or subsequent
to said electronic operation performing step,
images defined by said manual operations; and
wherein said images defined by said manual operations
include an image of the newly created document.

The copying system may further include a
scanner, coupled to the processing means, for scanning
a document containing said image information.

The system preferably further includes means
for sensing vibrational signals on the surface; the
processing means being adapted to recognise a tap
or strike by a user on the surface.

Preferably, the processing means includes a
frame grabber, for storing video frames, and differencing
means, for establishing the difference between
pixel data values of corresponding pixels in
successive video frames, and for displaying the resultant
video frame data. Preferably, the processing
means includes thresholding means, for converting
multi-bit per pixel video frame data to 1 bit per pixel
video frame data. Preferably, the thresholding means
is adapted for carrying out said converting operation
based on an estimate equal to the moving average of
pixel intensities in a local area. Preferably, the local
area comprises 1/nth of the width of a video frame,
where n is preferably about 8.

Preferably, the processing means includes a
frame grabber, for storing video frames, and means
for calibrating positions in the frame grabber relative
to positions within the display. Preferably, the calibrating
means includes means for projecting a mark
at four points in the display and carrying out said
calibration by means of a four point mapping, given by

x' = c_1 x + c_2 y + c_3 xy + c_4
y' = c_5 x + c_6 y + c_7 xy + c_8,

where (x,y) is a point in the display (21) and (x',y') is
a corresponding point in the video frame stored in the
frame grabber.

The processing means may further include
means for determining whether the user is right- or
left-handed.

The present invention further provides a method
of generating documents, comprising: providing a
work surface, means for displaying images on the
work surface, and a camera focussed on the work
surface, said camera generating video signals representing
in electronic form image information present
within the field of view of the camera; recognising one
or more manual operations relating to the image information
which are executed by a user within the
field of view of the camera; performing electronic operations,
corresponding to said manual operation(s),
on the electronic form to produce a modified electronic
form; displaying, simultaneously with or subsequent
to said electronic operation performing step,
images defined by said manual operation(s); and
wherein said images defined by said manual operations
include an image of a newly created document.

The manual operation(s) may include designat- 
ing a plurality of the extremities of a shape encom- 
passing said selected portion of text or image infor- 
mation. 

Preferably, the images defined by said manual 
operation(s) include an outline of, or a shaded area 
coincident with, said shape. Preferably, the shape is 
a rectangle. 

The manual operation(s) may include pointing 
with a plurality of fingers at the corners of said shape. 

Alternatively, said extremities are designated us- 
ing a stylus in association with a position sensing tab- 
let on the surface. 

The manual operation(s) may include designat- 
ing a text or image unit in the document by pointing a 
finger at it. The manual operation(s) may include des- 
ignating a successively larger text or image unit in the 
document by tapping on the surface. The manual operation(s)
include confirming a text or image selection
by tapping on the surface. The manual operation(s) 
may include copying the selected text or image to a 
location in a new document displayed on the surface 
by pointing at the selected text or image using a finger 
or stylus and dragging the finger or stylus across the 
surface to said location in the new document, and 
dropping the selected text or image at said location by 
tapping on the surface. The manual operation(s) may 
include changing the dimensions of selected text or 
image by changing the separation of finger tips of the 
user defining extremities of the selected text or im- 
age. The manual operation(s) include placing paper 
signs within the field of view of the camera, the signs 
defining operations to be performed on selected text 
or image information. 

The invention further provides a programmable 
copying apparatus when suitably programmed for 
carrying out the method of any of claims 4 to 7, or any 
of the above described particular embodiments. 

The present invention further provides a copying 
system according to claim 9 of the appended claims. 

The copying system may further include means
for scanning the second document to generate an 
electronic version of the second document; wherein 
the processing means includes means for recognis- 
ing the positions of the transferred image information 
in said electronic version; the system further includ- 
ing means for printing said transferred image informa- 
tion on said second document. 

The present invention further provides an interactive
image reproduction system, comprising: a plurality
of workstations interconnected by a communications
link, each workstation comprising a copying system
according to any of claims 1 to 3, 8 or 9, each
workstation being adapted for displaying the video
output from the camera of the or each other workstation.
The system may further include an audio or
videoconferencing link between the workstations.

Embodiments of the present invention will now
be described, by way of example, with reference to
the accompanying drawings, in which:

Fig. 1 is a schematic diagram of a copying system
according to the invention;
Fig. 2 shows schematically a known imaging system
into which the system of Fig. 1 may be incorporated;
Fig. 3 illustrates an image generated in the finger-tracking
technique employed in the present invention;
Fig. 4 shows a four point mapping technique employed
by the present invention;
Fig. 5 illustrates four ways to sweep out a selection
rectangle when using the present invention;
Figs. 6(a) to (f) show successive scenes of the
desk surface in a copying operation according to
one embodiment of the invention;
Fig. 7 is a flow chart of the procedure of Fig. 6;
Figs. 8(a) to (e) illustrate successive scenes of
the desk surface in a copying operation according
to a second embodiment of the invention;
Fig. 9 is a flow chart of the procedure of Fig. 8;
Figs. 10(a) to (h) show successive scenes of the
desk surface in a copying operation according to
a third embodiment of the invention;
Fig. 11 is a flow chart of the procedure of Fig. 10;
Fig. 12 shows a view from above of the desk surface
during a copying operation according to a
fourth embodiment of the invention;
Fig. 13 is a flow chart of the procedure of Fig. 12;
Fig. 14 shows a view from above of the desk surface
during a copying operation according to a
fifth embodiment of the invention; and
Fig. 15 is a flow chart of the procedure of Fig. 14.

Referring to Fig. 1, this illustrates schematically
the copying system of the present invention. A flat
desk surface 2 has placed on it a document 4 to be
used as a source of textual or graphical information
during manipulations which are described in detail
below. The document 4 is located within the field of
view of a video camera 6 mounted above the desk
surface 2. A video projector 8 is mounted adjacent the
camera 6 and projects onto the surface 2 a display 21
which is generally coincident with the field of view of
the camera 6, and which, in the example shown, includes
an image of a newly created document 20, as
discussed below. The camera 6 and the projector 8
are both connected to a signal processing system,
generally designated 10, which is in turn connected
to a printing device 208 and, optionally, a document
scanner 206 (see Fig. 2). A small snare-drum microphone
16 (preferably with built in amplifier) is attached
to the bottom of the desk and picks up audible or
vibrational signals. The system 10 monitors the (digitised)
signal amplitude of the microphone 16 to determine
(e.g. by comparison with a threshold value)
when the user taps on the desk 2 (e.g. to designate
an operation (see below)).
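
The comparison of the digitised microphone amplitude with a threshold value can be sketched as follows. This is a minimal illustration only, not the implementation used by the system 10: the function name, the sample values and the `refractory` parameter (which stops the ringing of one strike from being reported as several taps) are assumptions introduced for the example.

```python
def detect_taps(samples, threshold, refractory=3):
    """Return the sample indices at which a tap is detected.

    samples:    digitised microphone amplitudes (hypothetical values).
    threshold:  amplitude at or above which a sample counts as a tap.
    refractory: samples to ignore after a tap (an assumption, so that
                one strike is not reported more than once).
    """
    taps = []
    skip_until = -1
    for i, amplitude in enumerate(samples):
        if i < skip_until:
            continue  # still inside the ringing of the previous tap
        if abs(amplitude) >= threshold:
            taps.append(i)
            skip_until = i + refractory
    return taps
```

With a suitable threshold, each returned index could then be forwarded to applications as the equivalent of a mouse button event.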

The architecture of the signal processing system
10 is schematically illustrated in Fig. 1. This implementation
runs on standard X Window applications
using the human finger as a pointing device. The system
is implemented so that finger tip location and
desk tapping information are sent through X in such
a way that, from the point of view of applications, these
events are indistinguishable from those of a conventionally-used
mouse. The system runs on two machines:
a Sun 4/110 (104) and a SPARCstation (106).
This is because the image processing board 102
plugs into a VME bus, while the projected video (LCD)
display plugs into an Sbus. The images captured by
the camera 6 are initially processed by an Itex100 image
processing board 102. Any other suitable architecture
could be used to achieve the image signal
handling. Figure 1 illustrates how the software modules
interface to each other and to the hardware. The
system is implemented in C++ and C under SunOS
and TCP/IP.

The desk-camera-projector arrangement (2,6,8)
may be located remotely from the printing device 208 
and any number of such arrangements may be linked 
up to a common printer. Alternatively, the surface 2 
may itself constitute an upper surface of a copying or 
printing machine, or the surface of a desk next to such 
a machine, with the advantage that any documents 
created using the system may be immediately printed 
out and taken away by the user. The processor 10 
may form an integral part of a copying or printing ma- 
chine, or may be remotely located in a separate de- 
vice and coupled to the printer by a conventional com- 
munications link. 

In a preferred embodiment, the system of Fig. 1
forms an integral part of a printing system, for example
as schematically illustrated in Fig. 2 and described
in detail in EP-A-592108, with the exception that
appropriate elements of the control section 207 are
replaced by hardware from Fig. 1, such as the user interface
252 (which is implemented by the camera-projector
arrangement (6,8)), the system control 254,
etc. For additional control detail, reference is made to
US-A-s 5,081,494, 5,091,971 and 4,686,542.

Interacting with objects with bare fingers is facili- 
tated in the present invention through video-based 
finger-tracking. A bare finger is too thick, however, to 
indicate small objects such as a single letter, for which 
a pen or other thin object is used. 

The current implementation uses simple image
processing hardware to achieve the desired interactive
response time (although suitable algorithms
could be used to achieve the same result): it initially
subsamples the image and processes it at very low
resolution to get an approximate location for the finger.
Only then does the system scale to its full resolution
in order to get a precise location, so only small
portions of the image need to be processed. If the
user moves too quickly, the system loses track of
where the finger is, so it immediately zooms back out
to find it. The result is that large, quick movements are
followed less precisely than fine movements, but for
pointing applications this seems acceptable.
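
The coarse-then-fine search described above can be illustrated in software. The sketch below is an assumption-laden example rather than the hardware method of the board 102: it takes a binary motion mask, uses the centroid of the set pixels as a stand-in for the finger location, and the names `centroid`, `locate_coarse_to_fine`, `step` and the search radius of twice the step are all hypothetical.

```python
def centroid(mask, x0, y0, x1, y1, step=1):
    """Centroid (x, y) of set pixels in [x0, x1) x [y0, y1),
    visiting only every `step`-th pixel; None if no pixel is set."""
    sum_x = sum_y = count = 0
    for y in range(y0, y1, step):
        for x in range(x0, x1, step):
            if mask[y][x]:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


def locate_coarse_to_fine(mask, step=4):
    """Approximate location from a subsampled pass, then a
    full-resolution pass restricted to a small window around it."""
    height, width = len(mask), len(mask[0])
    coarse = centroid(mask, 0, 0, width, height, step)
    if coarse is None:
        return None  # nothing found: the caller keeps searching coarsely
    cx, cy = int(coarse[0]), int(coarse[1])
    radius = 2 * step  # assumed size of the full-resolution window
    return centroid(mask,
                    max(0, cx - radius), max(0, cy - radius),
                    min(width, cx + radius + 1),
                    min(height, cy + radius + 1))
```

Only the window around the coarse estimate is examined at full resolution, which is the property that keeps the amount of processed image data small.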

Interaction techniques using video-based finger
tracking are demonstrated by M. Krueger (Artificial
Reality II, Addison-Wesley, 1991). The disadvantages
of his system are discussed in UK patent application
9313637.2 (hereafter Ref. 1), a copy of which was
filed with the present application.

In contrast, the present invention senses motion
(since most objects on the desk 2 do not move except
the user's hands and the objects they are holding): it
captures sequential video frames and examines the
image produced by subtracting the sequential values
of each pixel in two successive frames. The result for,
e.g., a moving hand looks like Figure 3. Further processing
is then carried out to remove noise and to locate
the precise position of the fingertips.

Motion detection uses an image loop-back feature
of the board 102 that allows the most significant
bits of two images to be sent through a look-up table.
This table is set up to subtract the two images, allowing
very fast differencing of successive frames. Current
finger-tracking performance using the Sun 4/110
and Itex100 image processing board 102 is 6-7
frames/sec.
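
A software equivalent of the frame differencing step might look like the sketch below. It subtracts pixel values directly instead of passing the most significant bits through a look-up table as the board 102 does, and the `noise_floor` value and both function names are assumptions made for illustration.

```python
def difference_frames(prev, curr, noise_floor=16):
    """Binary motion mask: 1 where a pixel changed by more than
    `noise_floor` grey levels between two successive frames."""
    return [[1 if abs(c - p) > noise_floor else 0
             for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev, curr)]


def motion_bounding_box(mask):
    """Bounding box (left, top, right, bottom) of the moving pixels,
    or None when nothing moved."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for row in mask for x, value in enumerate(row) if value]
    return (min(cols), rows[0], max(cols), rows[-1])
```

Because static objects on the desk cancel out in the subtraction, the bounding box tends to enclose only the moving hand, which is where the fingertip search can then be concentrated.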

Determining when the user taps on the desk is
preferably achieved using the microphone 16. Another
way to detect tapping is to use a touch screen,
which can provide dragging information as well as extra
location data.

Projection from above provides similar capabilities
to a large flat display screen, but it has the key
advantage that computer-generated images 21 can
be superimposed onto paper documents. This is necessary
for creating merged paper and electronic
documents, and for providing feedback when making
selections 22, 28, 31 (see below) on paper. Overhead
projection, however, does produce problems, such as
shadows: these are hardly noticed when the projector
is mounted above a horizontal desk 2, but special
measures must be taken to avoid shadow problems
on a nearly vertical surface, if this is used as the work
surface (like a drawing board).

The brightness of the room may affect the clarity
of the projected display. This is not a problem with
normal fluorescent lights, but a bright desk lamp or direct
sunlight should be avoided. The area onto which
display 21 is projected should preferably be white.

In the implementations described herein the image
output device may be any device that provides an
image in the work surface: e.g. a CRT display conveyed
to the surface 2 by means of mirror elements
above or below the surface 2; or a flat panel LCD display
integral with the desk and disposed either at the
surface or below it. Any number of displays may be
used.

Document images are captured through an overhead
video camera 6, but a difficulty with standard
video cameras is their low resolution (compared to
scanners), which may not permit good quality copying
from a document on the work surface. Several solutions
are possible.

One technique is to use a very high resolution 
camera 6 for the system. 

Another solution is to use multiple cameras. At
least one camera 6 is set up with a wide field of view
that encompasses substantially the entire work
surface 2. At least one other camera (hereafter "subsidiary
camera"; not shown) is mounted adjacent the
main camera 6 and zoomed in to cover a small part
of the desk surface 2 (within or outside the display
area 21) at high resolution (e.g. about 200 spots/inch;
8 spots/mm). Multiple fixed subsidiary cameras may
be used to cover the whole area at high resolution, or
fewer movable subsidiary cameras could be used.
The video signal from each camera is processed via
a respective channel of the image processing board
102, by means of suitable multiplexing techniques
which are well known in the art. When such a subsidiary
camera with a relatively small field of view is used,
a light area (e.g. a white "window" or other visual indication
such as a black rectangular outline) is projected
onto the surface 2 so as to coincide with the
field of view of the high resolution subsidiary camera(s)
and indicate to the user exactly what part of the
work surface is within that field of view (i.e., the active
area). The user can therefore place source documents
4 within this high resolution "window" to enable
text or image information to be scanned in by the system
at high resolution. Since it has been found so far
that only small parts of a document at a time need be
used, and sliding a piece of paper into the camera's
"window" is so easy, the use of multiple fixed cameras
to cover the whole desk appears unnecessary.

A further possible technique solves this problem
by storing information about the positions in the
source document 4 of the part(s) to be copied, by
means of image recognition techniques and the use
of document descriptors. The document is then put
through a (desktop) scanner 206 (or is pre-scanned
before image manipulation takes place), preferably a
high resolution (e.g. 24 dots/mm; 600 dots/inch)
scanning machine, and this position information is
used to determine what parts of the scanned image
to use in the eventual copy. With this technique, the
user interacts with the documents at the lower (camera)
resolution, but the finished product is constructed
from higher (scanner) resolution images. Pre-scanning
is inconvenient for many interactive applications,
however, so it is preferred to use one of the
abovementioned alternative methods.

The image produced from a video camera 6 and 
frame grabber (not shown) on the board 102 is grey- 
scale (typically eight bits/pixel). This grey-scale im- 
age must be thresholded, or converted to a one 
bit/pixel black and white image, before it can be used 
for character recognition or any of the other embodi- 
ments which are described herein. 

Simple thresholding is not adequate for obtaining 
an image suitable for character recognition. Another 
problem can be automatic grey balancing on the cam- 
era. This can cause a change in brightness in one part 
of the image to affect the values in all other parts. 

Consequently, the system uses an adaptive 
thresholding algorithm which varies the threshold val- 
ue across the image according to its background val- 
ue at each pixel. The present system produces results 
in a single pass which are nearly as good as systems 
requiring multiple passes through the image, by cal- 
culating the threshold value at each point from an es- 
timate of the background illumination based on a 
moving average of local (within about 1/8th the width 
of the image) pixel intensities. This method is fast and 
can be combined with a scaling operation if neces- 
sary. 
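
The single-pass adaptive thresholding described above can be sketched as follows. The text specifies only that the background estimate is a moving average over about 1/8th of the image width; the sketch therefore makes several assumptions of its own: the average runs along each scan line over the most recent pixels, a pixel is classed as black when it falls a fixed fraction `margin` below that average, and the function and parameter names are hypothetical.

```python
def adaptive_threshold(image, n=8, margin=0.15):
    """Convert a grey-scale image (rows of 0-255 values) to one
    bit per pixel: 1 = black (ink), 0 = white (background).

    The background at each pixel is estimated as the moving average
    of the last width/n pixels on the same row (an assumed window
    shape); `margin` is an assumed safety factor.
    """
    width = len(image[0])
    window = max(1, width // n)
    result = []
    for row in image:
        total = 0
        out_row = []
        for x, value in enumerate(row):
            total += value
            if x >= window:
                total -= row[x - window]  # drop the pixel leaving the window
            background = total / min(x + 1, window)
            out_row.append(1 if value < background * (1 - margin) else 0)
        result.append(out_row)
    return result
```

Because the threshold follows the local background, a dark mark is detected both on a brightly lit and on a dimly lit part of the page, where a single global threshold would fail on one of them.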

Finally, when dealing with text, the thresholded 
image is skew-corrected and recognised by an OCR 
server (in this case, Xerox Image System's Scan- 
WorkX). If the resolution is high enough relative to the 
text size, then it returns the associated ASCII string. 
For accuracy, it is important to provide both quick 
feedback to the user (by displaying immediately the 
number or character which the system "thinks" it has 
recognised), and a simple way for the user to correct 
unrecognised characters. 

To support interaction, projected feedback 24, 26 
(see Fig. 6) to the user, and selective grabbing of im- 
ages through the camera 6, the system must map co- 
ordinates in the projected display 21 to coordinates in 
the frame grabber of image processor board 102. 
This calibration may be difficult because the projected
display 21 is not a perfect rectangle (there are optical
distortions such as "keystoning"), the camera 6
and/or tablet may be rotated relative to the projected
display 21, and it may be necessary for the camera 6
to view the projected display from an angle. Also, vibrations
caused by, e.g., air conditioners or slamming
doors cause movements which disrupt the calibration,
as do any adjustments to the equipment.

In the case where stylus input is used to indicate 
position on a tablet, the system first maps absolute 
positions on the digitising tablet to positions on the 
display 21 in order to provide feedback. Second, positions
on the display 21 are mapped to corresponding
positions in the frame grabber in order to support 
grabbing of selected areas 22, 28, 31 (see below) on 
the desk. Obtaining the data to calibrate the pointing 
device (stylus + tablet; touchscreen) to the display 21 
is relatively straightforward: a series of points are dis- 
played and the user is prompted to touch them with a 
pointer 32 (see Fig. 8). 

In the case where position indication by finger tip 
location is used, obtaining data for calibrating the vid- 
eo camera 6 to the display 21 is not as simple. 

Figure 4 shows an approach which improves on 
prior techniques: this is to project an object that can 
be located by the image processing system, allowing 
the system to self-calibrate without any assistance 
from the user. The present system projects a thick 
"plus" sign (+), and uses image morphology (see D. 
Bloomberg & P. Maragos, "Image Algebra and Mor- 
phological Image Processing", SPIE Conference 
Procs, San Diego, CA, July 1990) to pinpoint the cen- 
tre of the mark in the frame grabber coordinate space. 

For accurate mapping, preferably a four point calibration
system (Fig. 4) is used, which compensates
for rotation and keystoning. The mapping is given by
the equations

x' = c_1 x + c_2 y + c_3 xy + c_4    (1)
y' = c_5 x + c_6 y + c_7 xy + c_8    (2)

where (x,y) are coordinates in the projected display,
and (x',y') are coordinates in the frame grabber.

With four point pairs, the set of simultaneous lin- 
ear equations can be quickly solved by Gaussian 
Elimination. Then, a fifth plus mark (+) is projected 
and its location is checked to make sure it is close 
enough to the position produced by the above map- 
ping. The result is accurate to within one or two dis- 
play pixels, allowing the user to select areas 22, 28, 
31 on the desk 2 and rely on the displayed feedback 
24 to precisely indicate what will appear in the grab- 
bed image. 
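
The solution of equations (1) and (2) from four point pairs can be illustrated with a short sketch. Each equation is linear in its four coefficients, so the four pairs give two 4x4 systems sharing the same matrix with rows (x, y, xy, 1). The code below is an example under stated assumptions, not the system's own routine: the function names are hypothetical and partial pivoting is added for numerical safety.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("calibration points are degenerate")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back-substitution
        known = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_mapping(display_pts, grabber_pts):
    """Return ([c1..c4], [c5..c8]) from four (x, y) -> (x', y') pairs."""
    a = [[x, y, x * y, 1.0] for x, y in display_pts]
    cx = solve_linear(a, [p[0] for p in grabber_pts])
    cy = solve_linear(a, [p[1] for p in grabber_pts])
    return cx, cy


def apply_mapping(coeffs, x, y):
    """Map a display point to frame grabber coordinates."""
    cx, cy = coeffs
    return (cx[0] * x + cx[1] * y + cx[2] * x * y + cx[3],
            cy[0] * x + cy[1] * y + cy[2] * x * y + cy[3])
```

The check with a fifth projected mark then amounts to comparing `apply_mapping` at the fifth display point with the position at which the mark is actually located in the frame grabber.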

Unlike those of a traditional workstation, user interfaces
on the present system must take account of handedness.
If feedback 24 (see, e.g., Fig. 6) is projected to
the lower left of the pointer 32 (finger, stylus), for example,
then a right-handed person has no trouble
seeing it, but a left-handed person does have trouble
because it gets projected on the hand. Not only is
feedback affected, but also the general layout of applications:
left-handed users are inconvenienced
because the layout requires them to reach their arms farther
than right-handed users, and their arms hide the
paper 4 they are reading. The system's video camera
6 can see the user's hands, so it preferably recognis- 
es automatically which hand the user is pointing with, 
and then uses this information in implementing the in- 
terface during the following work session. A pop-up 
menu, for example, is preferably projected to the left 
of the pointer for a right-handed person, and to the 
right of the pointer for a left-handed person. 



When pointing at paper with the present system,
the camera 6 must be able to "see" the paper 4, and
this means that fingers and other pointing devices 32
must be out of the way. However, new users do not
seem to have much difficulty learning how to interact
with the system in a way that keeps selections
(22,28,31) visible. When sweeping out a rectangle 24
(for example in the Fig. 6 embodiment described below)
there are four ways of doing this (see Fig. 5). If
right-handed people use method 0, or if left-handed
people use method @, they obscure the selection.
But users do not seem to repeat the mistake: in general,
the system cannot see a selection (22, 28, 31)
unless the user can see it too, and that seems easy
for people to learn.

Selection feedback 24 can also play an important
role in preventing obscuration. In the implementation
discussed herein, the projected selection rectangle
24 floats slightly ahead of the pointer, so it is easy to
avoid placing the pointer inside.

Selecting parts of a document to be copied 

In Figs 6(a)-(f) a basic user interface technique
made possible by the copying system of the present
invention - the selection of parts of a paper document
4 directly on the paper itself while the system reads
the image selected - is illustrated in successive
scenes, viewed from above the desk surface 2. The
user 18 is creating a new document (generally designated
20) within the projected display 21, and here
the source document 4 is a book page. The user selects
a figure 22 on the book page 4 by first touching
his two index fingers together at the top right hand
corner of the figure 22: the system recognises this as
a gesture for starting a selection. As the user then
moves his left hand index finger to the bottom left
hand corner of the figure 22 (motion (§) in Fig. 5), the
system processor recognises the movement and
causes the projector to display, as feedback to the
user, a selection block 24 (here a rectangular outline;
alternatively a grey rectangle) which increases in size
until this movement ceases (Fig. 6(a)). The user can
see exactly what is encompassed by the selection
block 24, and when this is as desired, the user taps
on the desk to confirm the selection (the tap being interpreted
by the processor as such confirmation). The
processor obtains via the camera 6 information indicating
the positions of the boundaries of the selection
block 24 relative to the original document 4 and therefore
the extent of part of the document which has
been selected.

Next, the user puts his pointed finger on the page
4 in the selection block 24 and "drags" a projected image
of the selection block 24 by moving his finger
across the display 21 (the selection block 24 is displayed
by the projector 8 for feedback and moves to
follow the position of the moving finger tip), positions
it in the appropriate location in the document 20, and
taps on the desk with a finger to confirm the positioning
(Fig. 6(b)). The figure 22 is captured by the camera
6, thresholded by the processor and the stored
form of the document 20 edited accordingly; and the
result is that the projected display is modified so that
the figure 22 is "pasted" into the document 20 at the
desired location (Fig. 6(c)); here, the dimensions of
the pasted-in figure 22 are adapted by the processor
10 to the available height and width of the text area
of the new document 20.
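
One way in which the dimensions of a pasted-in figure could be adapted to the available text area is sketched below. The text does not say how the adaptation is performed; preserving the aspect ratio, rounding to whole pixels and the function name `fit_to_area` are all assumptions made for this example.

```python
def fit_to_area(src_w, src_h, area_w, area_h):
    """Scale a captured figure so that it fits inside the available
    text area while keeping its aspect ratio (an assumed policy).

    Returns (new_width, new_height, scale).
    """
    scale = min(area_w / src_w, area_h / src_h)
    return (round(src_w * scale), round(src_h * scale), scale)
```

For instance, a 400 x 200 pixel capture placed in a 300 x 300 pixel area would be limited by the width and scaled by 0.75.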

It is then possible for the user to add a legend to
the pasted-in figure 22 by typing out the text thereof
on a conventional keyboard (not shown) linked to the
processor 10. The stored electronic form of the document
20 is edited by the processor, and the projected
display 21 is simultaneously modified to show the legend
26 (Fig. 6(d)) as it is typed in.

Next, the user selects a portion 28 of text from the
book page 4 which is to be pasted in below the figure
in document 20 (Fig. 6(e)). This is done in exactly the
same way as selecting the figure 22 in Fig. 6(a), except
that the user starts with both index fingers at the
top left hand corner of the text portion 28 and moves
his right hand index finger to the bottom right hand
corner of the text portion 28 (motion <§) in Fig. 5). The
text selection 28 is positioned in the document by tapping
on the surface 2, as before. The difference in this
case is that optical character recognition (OCR) is
performed on the selected text portion 28 captured by
the camera 6. The font of the text 28 is automatically
converted into that of the rest of the electronic document
20, and the text is reformatted so as to fit into the text
flow of the document 20 being made up. (Alternatively
the text 28 could be treated in the same way as the
figure 22 in Figs 6(a) to (c), by selecting from top right
to bottom left: i.e. motion ® in Fig. 5.) The stored electronic
form of the document 20 is updated accordingly
by the processor 10 and the projector 8 automatically
displays the modified document 20 (Fig. 6(f)).
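
The way the direction of the selection gesture distinguishes text (to be passed through OCR) from a figure (to be pasted as an image) can be sketched as follows. This is a hedged illustration: the function name, the returned labels and the coordinate convention (x increasing to the right, y increasing down the page) are assumptions.

```python
from typing import Tuple

Point = Tuple[float, float]


def classify_selection(start: Point, end: Point) -> str:
    """Classify a selection by the direction in which it was swept out.

    Assumed convention: x grows rightwards, y grows down the page.
    Top left to bottom right  -> "text"  (OCR and reformatting applied)
    Top right to bottom left  -> "image" (pasted as a thresholded picture)
    Any other direction       -> "unknown"
    """
    (x0, y0), (x1, y1) = start, end
    if y1 <= y0:          # the gesture did not move down the page
        return "unknown"
    if x1 > x0:
        return "text"
    if x1 < x0:
        return "image"
    return "unknown"
```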

Once completed, the document 20 can be printed
out by conveying to the processor 10 a command to
send the electronic version of the document 20 to the
printer 14. This command may be entered via the keyboard
or conventional mouse operation, but is preferably
designated by the user selecting an appropriate
item from a pull-down menu (not shown) accessible,
e.g. by finger pointing, in the display area 21 on the
surface 2.

In an alternative implementation, the work surface
2 may incorporate a touch pad or other position
sensing device; and the user can use an appropriate
stylus to indicate corners of a rectangular selection
block designating a part to be copied, such as by
starting in one corner and moving the stylus to the
opposite corner. (In this case, in order for the stylus and
position-sensing tablet to operate, the document 4
must be only a single sheet thick.) It is also possible
to select non-rectangular regions by tracing a "lasso"
around the part of the paper document to be copied.
Another possibility is for the user simply to point at a
region, whereupon the system can use image morphology
techniques to determine the scope of the selection.
One tap on the work surface could select only the
smallest discernible element pointed to (e.g. a letter
or word). Another tap in the same location would expand
the selection to include the sentence containing
that letter or word, a further tap causing selection of
the paragraph containing that sentence, or larger visual
unit, and so on. With all of these selection techniques,
precise feedback is projected so the user can
see exactly what is selected, and can therefore adjust
the selection if it is not exactly what the user wants
(e.g. by selecting a "don't care" location - beyond the
boundaries of the document 20 - whereupon the projected
selection is cancelled; and then re-selecting
from the source document).
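
The tap-to-expand behaviour can be sketched as follows. As a simplifying assumption, the sketch operates on already recognised plain text rather than on the camera image, so that regular expressions stand in for the image morphology techniques mentioned above; the function name and the rule that paragraphs are separated by blank lines are likewise hypothetical.

```python
import re


def expand_selection(text: str, index: int, taps: int) -> str:
    """Return the unit selected after `taps` taps at character `index`.

    1 tap -> word, 2 taps -> sentence, 3 or more taps -> paragraph.
    Paragraphs are assumed to be separated by a blank line.
    """
    if not 0 <= index < len(text):
        raise IndexError("index outside text")
    if taps < 1:
        raise ValueError("at least one tap is required")

    # Find the paragraph that contains the pointed-to character.
    start = text.rfind("\n\n", 0, index)
    start = 0 if start == -1 else start + 2
    end = text.find("\n\n", index)
    end = len(text) if end == -1 else end
    paragraph = text[start:end]
    if taps >= 3:
        return paragraph

    # Within the paragraph, find the word or sentence under the finger.
    pattern = r"\w+" if taps == 1 else r"[^.!?]+[.!?]?"
    local = index - start
    for match in re.finditer(pattern, paragraph):
        if match.start() <= local < match.end():
            return match.group().strip()
    return ""  # the finger is on whitespace or punctuation
```

Each further tap at the same location simply calls the function again with a larger `taps` value, and the projected feedback would be redrawn around the returned unit.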

Figure 7 illustrates, by means of a flow chart of 
appropriate software running in the signal processing 
system 10 of Fig. 1, the steps involved in carrying out
the procedure sequentially illustrated in Fig. 6. 

Copying onto marked document 

Another basic technique made possible by the 
present invention is the copying onto a previously 
marked document in novel ways. For example, a form 
can be filled in with data from parts of one or more 
other documents. 

This technique is illustrated in Figs 8(a) to (f), 
which show successive scenes, viewed from above 
the surface 2. The technique is similar to that used in 
the embodiment illustrated in Fig. 6, except that the 
document 20 consists of the information to be added 
to a marked document 30 (in this case a form) placed 
on the work surface 2. Operations are performed to 
indicate how the form 30 should be completed, pro- 
ducing a projected image showing the additional 
marks that are to be made on the document.

As illustrated in Fig. 8(a), the source document 4 
comprises a receipt positioned within the camera's 
field of view. The user selects the numerical total 31 
indicated on the receipt using the above-mentioned 
stylus and position-sensing tablet method (but any of 
the above-mentioned image-selection techniques 
could be used). An image of the selected number, 
captured by the camera 6, is projected back onto the 
display 21 at the position of the point of the stylus 32. 
As the projected image of the selected number is dragged
to the appropriate box 34 of the form 30, in a similar
way to the moving selection block 24 in the Fig. 6
embodiment, the motion of the number is shown in the
display by the projector 8 (Fig. 8(b)). The number is
recognised by the processor 10 using OCR and dropped
in the box 34 by releasing a button on the stylus,
or by the user tapping on the desk with his free hand.

Figure 8(c) illustrates an operation performed in
the case where the appropriate data is not present in
the source document 4: the user writes a date in the
appropriate box 36 of the form by hand. The movement
of the point of the stylus 32 is tracked as the
user writes, and an image is simultaneously projected
down onto the form 30 showing the ink which would
have been left on the form if the stylus were a pen.
The system recognises the user's characters as they
are written, converts the projected "ink" characters
into the same font as the other numbers on the form
30 and modifies the projected characters to make
them appear in that font. Once one entry (e.g. a date
in numerical form) has been made in this way, it can
be copied to other boxes in the same or a neighbouring
column using the above-described drag and drop
process, or even copied by making ditto signs by hand
in the appropriate places.

Once the relevant numbers have been "entered"
in the form 30, an operation can be performed on a
group of numbers. In Fig. 8(d) a column 38 containing
a set of "entered" numbers is selected using the stylus
32. Next, the user places on the form 30 a small piece
of paper having a button 39 printed on it and designated
"SUM", with its arrow pointing at the interior of
a box 40 on the form 30 in which a total is to be entered.
When the user "presses" the paper button by
tapping a finger on the piece of paper as shown in Fig.
8(e), the sum of the numbers in the selected column
38 is projected into the box 40. In doing this, the system
(1) recognises the function of the button (e.g. by
means of morph glyphs present in the drop shadow 42
of the button 39), (2) recognises the tapping of the
button so as to be aware of where and when to perform
the summing operation, (3) carries out the summing
operation, and (4) projects the resulting numerical
sum into the box 40.

When all the necessary entries have been made
in the form 30, the latter can be fed through a printer
14 in order to make the projected marks permanent.
[For this purpose it may be convenient, in the case
where the main printer 14 is remote from the desk 2,
to have an additional compact ink jet printer (not
shown) on the desk surface 2, enabling the printing
of the additional marks on the form and, if necessary,
the signing of the form by a user, to be carried out
immediately.] The processor 10, which stores the relative
positions of all the projected characters and numbers
with respect to the features of the form 30, causes
the corresponding ink marks to be made in the appropriate
locations (row/column/box) in the form during
the printing operation.

Figure 9 illustrates, by means of a flow chart of 
appropriate software running in the signal processing 
system 10 of Fig. 1, the steps involved in carrying out
the procedure sequentially illustrated in Fig. 8. 

This technique is preferably extended by performing
OCR on a selection from a source document
4 and then projecting the recognised numbers or
characters in the image to fill in a form or other pre-marked
document. In general, text selections are discriminated
from graphics selections and OCR is performed
where necessary. Optionally, a user may select a location
on a form 30 using one of the above-described
techniques, and then type characters into the projected
image with a conventional keyboard (not shown)
linked to the processor 10.

The above-mentioned paper buttons may also be 
used to extend this technique: various buttons dis- 
playing appropriate recognisable codes (as men- 
tioned above) are used to perform commonly execut- 
ed operations, such as currency conversions, averag- 
ing etc. 
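
The handling of a recognised paper button can be sketched as a lookup from the decoded button code to an operation on the selected numbers. This is a minimal sketch: the table name, the function name and the "AVERAGE" code string are hypothetical, and recognition of the code and of the tap is assumed to have been carried out already.

```python
from statistics import mean
from typing import Callable, Dict, Sequence

# Hypothetical table mapping decoded button codes to operations.
BUTTON_OPERATIONS: Dict[str, Callable[[Sequence[float]], float]] = {
    "SUM": sum,
    "AVERAGE": mean,
}


def press_button(code: str, selected_values: Sequence[float]) -> float:
    """Apply the operation named by a paper button to the selected numbers.

    The returned value is what would be projected into the target box.
    """
    try:
        operation = BUTTON_OPERATIONS[code]
    except KeyError:
        raise ValueError(f"unrecognised button code: {code!r}") from None
    if not selected_values:
        raise ValueError("no numbers selected")
    return operation(selected_values)
```

A table of this kind keeps the recognition step separate from the operations themselves, so that further buttons (for instance a currency conversion) could be added as new entries.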

This technique further includes recognising other 
tools and modifying the projected image appropriate- 
ly. When a stylus is used as a marking tool, and 
moved across the surface (e.g. for handwriting), a 
mark is produced in the image. If the resulting marks 
meet some criterion, recognition is performed and the 
marks replaced by the appropriate characters or num- 
bers. Also, an eraser is recognised similarly and, in 
addition to its physical erasure of marks, the project- 
ed image is modified appropriately. 

If a paper form is recognised by the system, then 
it can assist the user with prompts as to how to fill it 
out, and it can perform calculations (e.g. adding a col- 
umn of numbers) that are specified on the form. In 
general, the system can augment any recognisable 
paper form with features now available only with elec- 
tronic forms. 

Scaling and positioning document parts in 
projected image 

Another user interface technique made possible 
by the present invention is scaling or positioning parts 
of a document before copying. In this disclosure, the 
term "arrange" is used generally to include an opera- 
tion that scales (re-sizes) a document part, an oper- 
ation that positions a document part, or an operation 
that both scales and positions a document part. Pos- 
ition of a document part also includes orientation. The 
basic technique for arranging a document part is to 
perform arranging operations in the projected image. 
In other words, a user can provide signals through the 
camera requesting operations so that the document 
part appears at a different scale or different position 
in the projected image. In effect, the user changes the 
projected image until it shows a desired final, output 
document with the indicated document part scaled 
and positioned as desired. The output document can 
then be printed. 

To perform scaling, the user can indicate a different
spacing between the opposite corners of a document
part bounded by a selection rectangle, such as
by moving the fingertips together or apart. Scale of
the selected part in the projected image can then be
changed in proportion to the change in spacing.
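
The proportional scaling rule can be sketched as follows, assuming that the two fingertip positions are available as (x, y) coordinates before and after the gesture; the function names are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def scale_factor(old_a: Point, old_b: Point,
                 new_a: Point, new_b: Point) -> float:
    """Ratio of the new fingertip spacing to the original spacing."""
    old_spacing = math.dist(old_a, old_b)
    if old_spacing == 0:
        raise ValueError("original fingertips coincide")
    return math.dist(new_a, new_b) / old_spacing


def scaled_size(width: float, height: float,
                factor: float) -> Tuple[float, float]:
    """Size of the selected part after scaling by `factor`."""
    return width * factor, height * factor
```

Moving the fingertips from 50 units apart to 25 units apart thus gives a factor of 0.5, and a selected part of 200 by 100 units would be projected at 100 by 50 units.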

To perform selection and positioning, the user
proceeds as in the embodiments of Figs 6 and 8, except
as mentioned below.

Figure 10 illustrates this technique: successive 
scenes, viewed from above the surface 2, show how 
the invention is employed by a user in producing a 
sketch. 

Initially, the user sketches out, using an ordinary
pencil, a scene which includes a tree 46 (Fig.
10(a)). Next, the user, desiring to create a row of
trees, selects the image of the tree 46 by sweeping
out a selection block 24 in the same manner as described
with reference to the Fig. 6 embodiment (Fig.
10(b)). The user then moves a copy of the selection
block 24 as in the Fig. 6 embodiment, except that two
fingers (or finger and thumb) are used which, when
the copied block 24 is in the desired position, are used
to reduce the size of the block to the desired scale
(Fig. 10(c)). The user taps on the desk surface 2 to
"drop" the reduced tree 48 in position, and the projector
displays the new tree 48 spaced apart from the
original 46. This process is repeated three more times
with the user's fingers progressively closer together,
to produce a row of trees (46-54) in perspective along
the lane 56 (Fig. 10(d)).

Next, the user begins to draw some slates 58 on
the roof 60 of the house 62 (Fig. 10(e)). In order to
save time the user places a paper button 64 designated
"FILL" with its arrow pointing at the roof 60. The
code in the drop shadow 66 of the button 64 is captured
by the camera 6 and the command is recognised
by the processor 10; similarly, the slate pattern
58 is captured when the user taps on the button 64.
The user then moves the button 64 to the empty region
of the roof and taps on the button again; and the
system executes the button command by replicating
the slate pattern 58 to fill the area within the boundaries
of the roof 60 (Fig. 10(f)). The resulting pattern is
displayed by the projector 8.
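
The replication performed by the "FILL" command can be sketched as tiling the captured pattern across a mask of the region to be filled. This is only an illustration: representing the roof as a binary mask, the pattern as a small binary tile, and aligning the tiling to the image origin are all assumptions.

```python
from typing import List

Bitmap = List[List[int]]


def fill_with_pattern(mask: Bitmap, pattern: Bitmap) -> Bitmap:
    """Tile `pattern` over every pixel where `mask` is 1.

    mask    -- 1 inside the region to be filled, 0 elsewhere
    pattern -- small tile of ink (1) and background (0) pixels
    Pixels outside the region are left as background.
    """
    tile_h, tile_w = len(pattern), len(pattern[0])
    return [
        [pattern[y % tile_h][x % tile_w] if inside else 0
         for x, inside in enumerate(row)]
        for y, row in enumerate(mask)
    ]
```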

A further step is illustrated by Figs 10(g) and (h):
the user decides to include a window 66 in the roof 60,
so some of the slate pattern must be erased. An
"eraser" 68 is used, having on it a printed sticker 70 displaying
a code, similar to that on the above-mentioned paper
buttons (39, 64), by means of which the system recognises
the implement as an eraser. As the user
sweeps out an area with the "eraser" 68, the erasing
motion is recognised by the system, and the displayed
pattern 58 is modified so as to omit the slate
pattern from that area (Fig. 10(g)). The user then
draws in the dormer window 66 by hand (Fig. 10(h)).

The result of these operations is a merged physical
and electronically projected sketch similar to the
combined form described in the Fig. 8 embodiment.
Again, in order to make the projected marks (e.g.
trees 48-54 and the slate pattern 58) permanent, the
sheet 30 containing the sketch would be passed
through a printer connected to the processor 10. The
processor, which stores the relative positions of all
the projected images with respect to either the features
of the sketch or the boundaries of the sheet 30,
causes the corresponding ink marks to be made in the
appropriate locations on the sheet 30 during the printing
operation.

Figure 11 illustrates, by means of a flow chart of 
appropriate software running in the signal processing 
system 10 of Fig. 1, the steps involved in carrying out
the procedure sequentially illustrated in Fig. 10. 

Another possibility for rotating and positioning 
parts of a source document 4 in the projected image 
is to move a paper original, for example containing im- 
age elements to be included in the final sketch, into 
the desired position within the projected document 
20, and to select the image element of interest (e.g. 
by tapping on the surface 2) to be "pasted down" in 
place. This natural interaction technique allows any 
printed or hand-drawn image to be used as a sort of 
rubber stamp and advantageously allows the user to 
try an image element in various positions without 
having to produce a new complete sketch each time. 

Random copying of document parts 

The above basic techniques are especially pow- 
erful when considered together with the possibility of 
randomly copying from a set of input documents 4 to 
produce output documents. This user interface tech- 
nique is based on obtaining information indicating the 
relationship between the input documents 4 and the 
output documents. 

One way to obtain this information is to operate 
on the documents in sequence. In other words, the 
output documents include parts from input docu- 
ments 4 in sequence, so that the input documents can 
be copied in order into the output documents. This 
can be inconvenient, however, such as when one of 
the input documents 4 includes different parts that are
copied into several of the output documents. 

Another way to obtain this information without op- 
erating on the documents 4 in sequence is to use 
document recognition techniques. This is similar to 
the previous way except that it is unnecessary to pro- 
vide identifiers on the documents. Instead, a docu- 
ment characteristic that can be detected at low reso- 
lution, such as line length pattern, can be used to ob- 
tain identifiers that are very likely to be unique for 
each document. Document classification techniques 
are described briefly in EP-A-495 622. 
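
How a line length pattern might serve as a low-resolution document identifier can be sketched as follows. This is a hedged illustration and not the classification technique of EP-A-495 622: it assumes a binarised low-resolution page image in which text lines are separated by blank pixel rows, and the function name and the quantisation step are hypothetical.

```python
from typing import List, Optional, Tuple


def line_length_signature(image: List[List[int]],
                          quantum: int = 4) -> Tuple[int, ...]:
    """Return the quantised lengths of successive text lines.

    image   -- rows of pixels, 1 = ink, 0 = background
    quantum -- lengths are divided by this so small errors are ignored
    """
    signature = []
    extent: Optional[Tuple[int, int]] = None  # (leftmost, rightmost) ink column
    for row in list(image) + [[]]:            # empty sentinel row ends the last line
        ink = [x for x, px in enumerate(row) if px]
        if ink:
            lo, hi = ink[0], ink[-1]
            extent = (lo, hi) if extent is None else (
                min(extent[0], lo), max(extent[1], hi))
        elif extent is not None:
            signature.append((extent[1] - extent[0] + 1) // quantum)
            extent = None
    return tuple(signature)
```

Two pages would then be treated as the same document when their signatures match, which is only very likely, not certain, to identify a document uniquely.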

"Select and Paste" (or "Copy and Paste") 

Although selecting text or images from one document,
and "pasting" the selection into a second document
is now a standard feature when manipulating
electronic documents, the same operation is awkward
to perform with real paper, requiring a photocopier,
scissors, and some glue or tape. The system of
the present invention, however, makes it possible to 
select and paste paper documents 4 in the same way 
that we select and paste electronic documents. In this 
implementation, a sketch 80 on paper 4 can be elec- 
tronically selected by sweeping out an area 24 of the 
paper (e.g. with a stylus 32) in a similar manner to that 
described above. When the stylus 32 is raised, the 
system snaps a picture of the selection 80, and the 
projected rectangle 24 is replaced by a thresholded
electronic copy of the area. This copy can then be 
moved about and copied to other parts 82 of the paper 
4 as described in the aforementioned application. 
Sliding this electronically projected copy over the 
drawing to place it somewhere else is very similar to 
sliding a paper copy (see Fig. 12). 

In Fig. 12, the user has selected the sketch 80 of 
the window on the sheet 4, and has made two copies 
of it (82). Now he has moved and is about to "paste 
down" a copy 86 of the flower 84 that he drew. 

Figure 13 illustrates, by means of a flow chart of 
appropriate software running in the signal processing 
system 10 of Fig. 1, the steps involved in carrying out 
the procedure illustrated in Fig. 12. 

User testing revealed another way of using this 
tool which is also very powerful. Instead of construct- 
ing a mixed paper and projected drawing, it has been 
found that a user can construct a purely projected 
drawing from selected portions taken from any num- 
ber of their paper sketches. The user can sketch a fig- 
ure on paper, move it to the desired location in the pro- 
jected drawing, then select it using the above- 
mentioned techniques so that it remains "pasted 
down" in that location even after moving the paper 
away. The effect is like that of dry-transfer lettering 
or "rubber stamping", but in this case from any piece 
of paper onto a projected drawing. This interaction 
technique is quite different from the standard "copy 
and paste" found on most workstations and takes ad- 
vantage of unique qualities of the present invention: 
using both hands for manipulating and pointing as 
well as the superimposition of paper and electronic 
objects. 

Multi-user systems 

People often use documents when working to- 
gether, and they often need simultaneously to see 
and modify these documents. Physical paper is nor- 
mally constrained in that it cannot be written on, point- 
ed to, or otherwise manipulated by two people simul- 
taneously who are, for example, located on separate 
continents; but this constraint can also be addressed 
by the present invention. 

Shared editing of documents has been disclosed
in, e.g. J.S. Olsen et al., "Concurrent editing: the
group's interface" in D. Daiper et al. (eds) Human
Computer Interaction - Interact '90, pp. 835-840,
Elsevier, Amsterdam. Most of this work has concentrated
on screen-based documents, but the multi-user
implementation of the present invention makes
it possible to share real paper documents. It allows
users in (at least) two separate locations to "share"
their physical desks, by enabling both users to see
and to edit each other's paper documents 4.

Referring to Fig. 14, in the case of a two-user system,
the two processors 10 are connected by means
of a conventional communications link. Each installation
continuously grabs images 88 from its local desk
2 and projects thresholded images 90 from the remote
desk 2'. The result is that both users see what
is on both desks. When a paper document 4 is placed
on a desk 2 of user A, it is projected onto desk 2' of
user B and vice versa. The projections are digitally
scaled and positioned to provide What You See Is
What I See (WYSIWIS), and both users can draw (using
a real pen 92, 92') on either paper documents 4
or on virtual documents. On both sides, the remote
user will see the new drawing projected in the corresponding
place. Hand motions are also transmitted
over the communications link and displayed, so if a
user points to a certain place on a document 4 the
other user can see this. (The partner's hands block
the view of what is underneath them, just as with an
ordinary desk, so this must be dealt with through social
protocols and speech: not pictured in Fig. 12 is an
audio link through telephones or speakerphones
which is preferably provided to facilitate this. Another
useful and even more preferable addition is a face-to-face
audio-visual link.)
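
The digital scaling and positioning needed for WYSIWIS can be sketched as a mapping of each desk's coordinates into a shared coordinate frame and back. This is a minimal sketch under stated assumptions: each installation is assumed to be calibrated by a single uniform scale and an offset, with no rotation or perspective correction, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DeskCalibration:
    """Assumed calibration of one desk: desk = shared * scale + offset."""
    scale: float
    offset_x: float
    offset_y: float

    def to_shared(self, x: float, y: float) -> Tuple[float, float]:
        """Convert a point on this desk into the shared frame."""
        return (x - self.offset_x) / self.scale, (y - self.offset_y) / self.scale

    def from_shared(self, u: float, v: float) -> Tuple[float, float]:
        """Convert a point in the shared frame into this desk's frame."""
        return u * self.scale + self.offset_x, v * self.scale + self.offset_y


def remote_to_local(point: Tuple[float, float],
                    remote: DeskCalibration,
                    local: DeskCalibration) -> Tuple[float, float]:
    """Where a mark made on the remote desk should be projected locally."""
    return local.from_shared(*remote.to_shared(*point))
```

With a mapping of this kind, a mark drawn by one user is projected at the corresponding place on the other user's desk even if the two camera-projector installations differ in resolution or alignment.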

In Fig. 14, the local user A is drawing an "X" 88 on
a paper sheet 4 in ink, while the remote user's (B) paper
and hand can be seen having just finished drawing
an "O" 90.

Figure 15 illustrates, by means of a flow chart of
appropriate software running in the signal processing
system 10 of Fig. 1, the steps involved in carrying out
the procedure illustrated in Fig. 14.



Claims

1. A copying system, comprising:

a work surface (2);

means (8) for displaying images on the
work surface;

a camera (6), focussed on the work surface,
for generating video signals representing in
electronic form image information present within
the field of view of the camera;

processing means (10) for recognising
one or more manual operations relating to the image
information which are executed by a user
within the field of view of the camera, and for performing
electronic operations, corresponding to
said manual operation(s), on the electronic form
to produce a modified electronic form;

the displaying means (8) being adapted to
display, under the control of the processing
means (10), simultaneously with or subsequent
to said electronic operation performing step, images
defined by said manual operations; and

wherein said images defined by said manual
operations include an image of the newly created
document.

2. The copying system according to claim 1, wherein
the modified electronic form includes an electronic
version of a newly created document.

3. The copying system according to claim 2, further
including means (208) for printing a document
corresponding to at least part of said modified
electronic form.

4. A method of generating documents, comprising:

providing a work surface, means for displaying
images on the work surface, and a camera
focussed on the work surface, said camera
generating video signals representing in electronic
form image information present within the
field of view of the camera;

recognising one or more manual operations
relating to the image information which are
executed by a user within the field of view of the
camera;

performing electronic operations, corresponding
to said manual operation(s), on the
electronic form to produce a modified electronic
form;

displaying, simultaneously with or subsequent
to said electronic operation performing
step, images defined by said manual operation(s);
and

wherein said images defined by said manual
operations include an image of the newly
created document.



5. The method according to claim 4, wherein the
modified electronic form includes an electronic
version of a newly created document.

6. The method according to claim 4 or 5, further including
the step of supplying to a printing device
said electronic version; and printing out said newly
created document.

7. The method according to claim 6, wherein said
manual operation(s) include selecting a portion of
text or image information in a document (4) located
within the field of view of the camera.

8. A programmable printing apparatus when suitably
programmed for carrying out the method of
any of claims 4 to 7.

9. A copying system, comprising:

a work surface (2);

means (8) for displaying images on the
work surface;

a camera (6), focussed on the work surface,
for generating video signals representing in
electronic form image information present in at
least first and second documents within the field
of view of the camera;

processing means (10) for recognising
one or more manual operation(s) which are executed
by a user within the field of view of the camera
and represent the transfer of image information
from the first document to the second document,
and for performing electronic operations,
corresponding to said manual operation(s), on
the electronic form of said second document to
produce a modified electronic form;

the displaying means (8) being adapted to
display, under the control of the processing
means (10), simultaneously with or subsequent
to said electronic operation performing step, images
defined by said manual operations.

10. An interactive image reproduction system, comprising:

a plurality of workstations interconnected
by a communications link, each workstation comprising
a system according to any of claims 1 to
3, 8 or 9, each workstation being adapted for displaying
the video output from the camera (6) of
the or each other workstation.